Semi-automatic Refinement of the JMdict/EDICT Japanese-English Dictionary

Authors

  • Francis Bond
  • Jim Breen
Abstract

The JMdict/EDICT Japanese-English Dictionary is a freely available dictionary distributed in XML (JMdict) and text (EDICT) formats. It is widely used as a source of lexical material in dictionary systems and text-processing projects. We propose two refinements to make the dictionary more computationally tractable: marking entries where the English is not a translation equivalent, and expanding contracted entries. We then propose and apply semi-automatic methods to refine existing entries. The resulting dictionary is shown to be more suitable for the construction of machine translation rules.

Similar resources

Enhancing a Dictionary for Transfer Rule Acquisition

The JMdict/EDICT Japanese-English Dictionary is a freely available dictionary distributed in XML (JMdict) and text (EDICT) formats. It is widely used as a source of lexical material in dictionary systems and text-processing projects. We propose two refinements to make the dictionary more computationally tractable: marking entries where the English is not a translation equivalent and expanding co...

Term Selection Term Selection Query - language Term Translation Doc - language Term Selection Term Weighting Term Matching Term Weighting Term Matching

This paper presents results for the Japanese/English cross-language information retrieval task on the NACSIS Test Collection. Two automatic dictionary-based query translation techniques were tried with four variants of the queries. The results indicate that longer queries outperform the required description-only queries and that use of the first translation in the EDICT dictionary is comparable w...

JMdict: A Japanese-Multilingual Dictionary

The JMdict project has as its aim the compilation of a multilingual lexical database with Japanese as the pivot language. Using an XML structure designed to cater for a mix of languages and a rich set of lexicographic information, it has reached a size of approximately 100,000 entries, with most entries having translations in English, French and German. The compilation involves information re-u...

Word Usage Examples in an Electronic Dictionary

This paper describes a project in which the Tanaka corpus of matched Japanese-English sentence pairs has been linked to the WWWJDIC online Japanese-English dictionary. The process of linking the corpus is described in detail, as well as an analysis of the word coverage, and the editing of the corpus to remove some of the errors it contains. The paper concludes that the Tanaka corpus can success...

Automatic Integrated Dictionary Systems

0. Introduction 1. AidTrans Project 2. I.D.S. Japanese Reading Course 3. Multiple-path Predictive Analysis 4. Sentence-for-sentence Analyser 5. English Output 6. Other Language Pairs 7. Japanese Script I/O 8. Automatic Integrated Dictionary Compiler 9. Automatic Integrated Dictionary 10. Man-aided Translating Machine 11. Japanese-English Teaching Machine 12. Japanese-English Scientific & Techni...

Publication date: 2007